59 research outputs found

    Prediction of amyloid fibril-forming segments based on a support vector machine

    Background: Amyloid fibrillar aggregates of proteins or polypeptides are known to be associated with many human diseases. Recent studies suggest that short protein regions trigger this aggregation. Thus, identifying these short peptides is critical for understanding diseases and finding potential therapeutic targets.
    Results: We propose a method named Pafig (Prediction of amyloid fibril-forming segments), based on support vector machines, to identify the hexapeptides associated with amyloid fibrillar aggregates. The features of Pafig were obtained by a two-round selection from AAindex. In a 10-fold cross-validation test on the Hexpepset dataset, Pafig achieved an overall accuracy of 81% and a Matthews correlation coefficient of 0.63. Pafig was then used to predict the potential fibril-forming hexapeptides among all 64,000,000 possible hexapeptides; approximately 5.08% of them showed a high aggregation propensity. Among the predicted fibril-forming hexapeptides, alanine, phenylalanine, isoleucine, leucine and valine occurred at higher frequencies, whereas aspartic acid, glutamic acid, histidine, lysine, arginine and proline appeared at lower frequencies.
    Conclusion: The performance of Pafig indicates that it is a powerful tool for identifying the hexapeptides associated with fibrillar aggregates and will be useful for large-scale analysis of proteomic data.
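The figure of 64,000,000 hexapeptides follows from 20 standard amino acids at each of six positions. The short sketch below only reproduces that arithmetic and the approximate count implied by the reported 5.08%; it is not part of Pafig, and the variable names are illustrative.

```python
# Size of the hexapeptide space: 20 standard amino acids at each of 6 positions.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
HEXAPEPTIDE_LENGTH = 6

total_hexapeptides = len(AMINO_ACIDS) ** HEXAPEPTIDE_LENGTH

# Fraction reported in the abstract as showing high aggregation propensity.
predicted_fraction = 0.0508
predicted_count = round(total_hexapeptides * predicted_fraction)

print(total_hexapeptides)  # 64000000
print(predicted_count)     # 3251200 (approximate, since 5.08% is a rounded figure)
```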

    Predicting changes in protein thermostability brought about by single- or multi-site mutations

    Background: An important aspect of protein design is the ability to predict changes in protein thermostability arising from single- or multi-site mutations. Protein thermostability is reflected in the change in free energy (ΔΔG) of thermal denaturation.
    Results: We have developed predictive software, Prethermut, based on machine learning methods, to predict the effect of single- or multi-site mutations on protein thermostability. The input vector of Prethermut is based on known structural changes and empirical measurements of changes in potential energy due to protein mutations. In a 10-fold cross-validation test on the M-dataset, consisting of 3366 mutant proteins from ProTherm, random forest classification and random forest regression were slightly more accurate than support vector machines and support vector regression; the overall classification accuracy and the Pearson correlation coefficient of regression were 79.2% and 0.72, respectively. Prethermut performs better on proteins containing multi-site mutations than on those with single mutations.
    Conclusions: The performance of Prethermut indicates that it is a useful tool for predicting changes in protein thermostability brought about by single- or multi-site mutations and will be valuable in the rational design of proteins.

    Predicting the phenotypic effects of non-synonymous single nucleotide polymorphisms based on support vector machines

    Background: Human genetic variations primarily result from single nucleotide polymorphisms (SNPs), which occur approximately every 1000 bases in the overall human population. The non-synonymous SNPs (nsSNPs) that lead to amino acid changes in the protein product may account for nearly half of the known genetic variations linked to inherited human diseases. One of the key problems of medical genetics today is to identify nsSNPs that underlie disease-related phenotypes in humans. The development of computational tools that can identify such nsSNPs would therefore enhance our understanding of genetic diseases and help predict disease.
    Results: We propose a method named Parepro (Predicting the amino acid replacement probability) to identify nsSNPs having either deleterious or neutral effects on the resulting protein function. Two independent datasets, HumVar and NewHumVar, taken from the PhD-SNP server, were used to train the model and test the robustness of Parepro. In a 20-fold cross-validation test on the HumVar dataset, Parepro achieved a Matthews correlation coefficient (MCC) of 0.50 and an overall accuracy (Q2) of 76%, both higher than those obtained with methods such as PolyPhen, SIFT, and HybridMeth. Further analysis on an additional dataset (NewHumVar) using Parepro yielded similar results.
    Conclusion: The performance of Parepro indicates that it is a powerful tool for predicting the effect of nsSNPs on protein function and would be useful for large-scale analysis of genomic nsSNP data.
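For readers unfamiliar with the two metrics quoted here, the sketch below shows the standard definitions of overall accuracy (Q2) and the Matthews correlation coefficient for a two-class predictor. The function names and the confusion-matrix counts are illustrative assumptions, not values from the Parepro study.

```python
import math


def q2(tp: int, tn: int, fp: int, fn: int) -> float:
    """Overall two-class accuracy: fraction of correct predictions."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient, ranging from -1 to +1."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal total is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Hypothetical counts, chosen only to exercise the functions.
print(q2(tp=40, tn=35, fp=15, fn=10))
print(mcc(tp=40, tn=35, fp=15, fn=10))
```

Unlike Q2, the MCC stays near zero for a predictor that merely guesses the majority class, which is why both numbers are usually reported together.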

    Identification of the para-nitrophenol catabolic pathway, and characterization of three enzymes involved in the hydroquinone pathway, in Pseudomonas sp. 1-7

    Background: para-Nitrophenol (PNP), a priority environmental pollutant, is hazardous to humans and animals. However, information on the PNP degradation pathways and their enzymes remains limited.
    Results: Pseudomonas sp. 1-7 was isolated from methyl parathion (MP)-polluted activated sludge and was shown to degrade PNP. Two different intermediates, hydroquinone (HQ) and 4-nitrocatechol (4-NC), were detected in the catabolism of PNP. This indicated that Pseudomonas sp. 1-7 degraded PNP by two different pathways, namely the HQ pathway and the hydroxyquinol (BT) pathway (also referred to as the 4-NC pathway). A gene cluster (pdcEDGFCBA) was identified in a 10.6 kb DNA fragment of a fosmid library. The cluster encodes the following enzymes involved in PNP degradation: PNP 4-monooxygenase (PdcA), p-benzoquinone (BQ) reductase (PdcB), hydroxyquinol (BT) 1,2-dioxygenase (PdcC), maleylacetate (MA) reductase (PdcF), 4-hydroxymuconic semialdehyde (4-HS) dehydrogenase (PdcG), and hydroquinone (HQ) 1,2-dioxygenase (PdcDE). Four genes (pdcDEFG) were expressed in E. coli, and in vitro activity assays showed that the purified pdcDE, pdcG and pdcF gene products convert HQ to 4-HS, 4-HS to MA, and MA to β-ketoadipate, respectively.
    Conclusions: The cloning, sequencing, and characterization of these genes, along with the functional PNP degradation studies, identified 4-NC, HQ, 4-HS, and MA as intermediates in the degradation pathway of PNP by Pseudomonas sp. 1-7. This is the first conclusive report of both 4-NC- and HQ-mediated degradation of PNP by one microorganism.

    Identification of the Transcriptional Regulator NcrB in the Nickel Resistance Determinant of Leptospirillum ferriphilum UBK03

    The nickel resistance determinant ncrABCY was identified in Leptospirillum ferriphilum UBK03. Within this operon, ncrA and ncrC encode two membrane proteins that form an efflux system, and ncrB encodes NcrB, which belongs to an uncharacterized family (DUF156) of proteins. How this determinant is regulated remains unknown. Our data indicate that expression of the nickel resistance determinant is induced by nickel. The promoter of ncrA, designated pncrA, was cloned into the promoter probe vector pPR9TT and co-transformed with either a wild-type or a mutant nickel resistance determinant. The results revealed that ncrB encodes a transcriptional regulator that can regulate the expression of ncrA, ncrB, and ncrC. A GC-rich inverted repeat sequence was identified in the promoter pncrA. Electrophoretic mobility shift assays (EMSAs) and footprinting assays showed that purified NcrB could specifically bind to the inverted repeat sequence of pncrA in vitro; this was confirmed by bacterial one-hybrid analysis. Moreover, this binding was inhibited in the presence of nickel ions. Thus, we classified NcrB as a transcriptional regulator that recognizes the inverted repeat sequence binding motif to regulate the expression of the key nickel resistance gene, ncrA.

    Presyncodon, a Web Server for Gene Design with the Evolutionary Information of the Expression Hosts

    In the natural host, most of the synonymous codons of a gene have been evolutionarily selected and are related to protein expression and function. However, when designing a new gene, most existing codon optimization tools select the high-frequency-usage codons and neglect the contribution of the low-frequency-usage codons (rare codons) to the expression of the target gene in the host. In this study, we developed the method Presyncodon, available in a web version, to predict the gene code from a protein sequence using built-in evolutionary information on a specific expression host. The synonymous codon-usage pattern of a peptide was studied from three genomic datasets (Escherichia coli, Bacillus subtilis, and Saccharomyces cerevisiae). Machine-learning models were constructed to predict the selection of synonymous codons (low- or high-frequency-usage codon) in a gene. This method can be used easily and efficiently to design new genes from protein sequences for optimal expression in three expression hosts (E. coli, B. subtilis, and S. cerevisiae). Presyncodon is free for academic and noncommercial users and is accessible at http://www.mobioinfor.cn/presyncodon_www/index.html
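The distinction between high- and low-frequency-usage codons rests on counting how often each codon appears in a host's coding sequences. The sketch below shows only that counting step on a made-up toy sequence; it is not the Presyncodon method (which uses machine-learning models), and `codon_counts` and `toy_cds` are hypothetical names.

```python
from collections import Counter


def codon_counts(cds: str) -> Counter:
    """Count codons in an in-frame coding sequence (length must be a multiple of 3)."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return Counter(cds[i:i + 3] for i in range(0, len(cds), 3))


# Toy sequence (not real genomic data): Met-Ala-Ala-Ala-stop.
toy_cds = "ATGGCTGCCGCTTAA"
counts = codon_counts(toy_cds)

# GCT and GCC both encode alanine; their relative counts give the usage pattern.
print(counts["GCT"], counts["GCC"])
```

In practice such counts would be accumulated over every annotated gene of the host genome before any frequency class is assigned.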

    Improving the secretion of a methyl parathion hydrolase in Pichia pastoris by modifying its N-terminal sequence.

    Pichia pastoris is commonly used to express and secrete target proteins, although not all recombinant proteins can be successfully produced. In this study, we used methyl parathion hydrolase (MPH) from Ochrobactrum sp. M231 as a model to study the importance of the N-terminus of the protein for its secretion. While MPH can be efficiently expressed intracellularly in P. pastoris, it is not secreted into the extracellular environment. Three MPH mutants (N66-MPH, D10-MPH, and N9-MPH) were constructed by modifying its N-terminus, and the secretion of each by P. pastoris was improved compared with wild-type MPH. The level of secreted D10-MPH was increased to 0.21 U/mL, while that of N9-MPH was enhanced to 0.16 U/mL. Although N66-MPH was not enzymatically active, it was secreted efficiently and was identified by SDS-PAGE. These results demonstrate that the secretion of heterologous proteins in P. pastoris may be improved by modifying their N-terminal structures.